1. INTRODUCTION

With the advancement of sensing mechanisms, automated sensing techniques have evolved considerably
in the research state of the art [1]. Various commercial products built on advanced sensing technologies are
now used to capture different kinds of information [2]. The human detection system is one such sensing
technique, and it utilizes different kinds of sensors [3]. Human activity detection systems are applied to
infer human behavior from a given scene. These systems are also used to interface human activities with a
system so that it performs a specific task. However, most applications demand better analysis of complex
human motions together with a user-friendly interface to the system. Most of the data captured from visual
sensors take the form of depth maps, RGB data, skeletal joints as nodal points, etc. [4]. The Kinect sensor
supports action detection by offering all these forms of information from the input image in different
combinations [5]. A large number of studies have used the Kinect sensor to address different sets of
problems [6]-[10]. All this information acts as input for applications such as video surveillance,
human-computer interaction, contextual attribute-based multimodal retrieval, etc. Human activity analysis is
also carried out at various levels of abstraction, known as activity, gesture, action, etc., which differ in
their potential features. Basically, a gesture can be termed a primary motion that is highly specific to one
part of the human body. An action, by contrast, relates to an activity undertaken by one performer as a
combination of different gestures. An action also involves the temporal factor associated with the generation
of multiple gestures and represents movement of the complete human body. However, from a computer vision
perspective, recognition systems for action and gesture are found to be used interchangeably over different
sets of recognition problems. The fundamental methodology for recognizing human action is to obtain
features from the motion aspect of video or image sequences in order to facilitate forecasting of one or a
specific set of actions. However, one of the bigger dependencies in human activity recognition is the feature
extraction process. A larger number of features generally tends to improve the accuracy of the recognition
system. However, capturing a higher number of features raises two potential problems: i) large processing
time and ii) high resource utilization. Both are detrimental to the computational performance of the system.
Although research on human activity recognition has grown significantly, the selection of effective features
using a cost-effective computational model is largely overlooked. Hence, there is a need for a system that
can identify the significant features of a given input. This paper introduces a novel mechanism for
cost-effective identification of features in the form of significant joints, considering skeletal modalities,
for the effective design of a human action recognition system.

The recent past has witnessed a lot of research addressing problems related to human activity
recognition systems. This section reviews some of the recent work in this field. Chen et al. [11], [12]
presented a wavelet-based algorithm that identifies and classifies the detected human activity using a
supervised learning approach for training. De et al. [13] proposed a dictionary learning-based technique that
utilizes sparse signal representation for activity recognition. Fullerton et al. [14] carried out a
qualitative and experimental analysis of human activity in which a k-nearest neighbor classifier is used to
enhance the accuracy score. Gavrilova et al. [8] investigated similar activity using the Kinect sensor.
Hbali et al. [15] utilized a skeleton-based approach that works on joints for a similar activity recognition
system. Further, Hernandez et al. [16] introduced a Bayesian approach to construct a segmentation technique
for a gait recognition system. Jain and Kanhangad [17] used gyroscopes and accelerometers to address
classification problems in human activity recognition. Khan et al. [18] used a correlation-based approach
with feature vectors to reduce the dimensionality of the attributes involved. Manzi et al. [19] offered a
skeleton-based human activity identification system. Noor and Uddin [20] presented a context-oriented
approach that improves the accuracy of activity recognition by using neural network-based training. Further,
Savvaki et al. [21] utilized a Hankel structure to represent image streams for better classification
performance. Sikder and Sarkar [22] applied a distance-based approach to motion data along with linear
regression to identify dynamic human activities and achieved higher classification accuracy. Ulhaq et al.
[23] used a space-time correlation-based mechanism for three-dimensional tensor structures to identify human
actions and reported increased accuracy in experiments on a video dataset. Vishwakarma and Singh [24] used
energy factors associated with silhouette images, exploiting the temporal content of the dataset through a
transform-based approach. Wang et al. [25] gave a context-based methodology built on a predictive approach
for activity monitoring. A network channel-based human activity identification system was discussed by
Wang et al. [26]. Xu et al. [27] presented a hierarchical approach that uses both distance and time for human
activity monitoring. Yang et al. [28] used a super normal vector to aggregate different discriminative
variables for the identification of human activity. Human motion tracking is addressed by Zhao et al. [29].
Hence, there are various categories of recent techniques for human activity recognition. The next section
discusses the problems encountered in the existing literature discussed above.

The research problems associated with activity identification systems are as follows:

a. Existing studies do not emphasize the computational complexity associated with selecting a large
number of features from a given set of human action data.

b. The possibility of minimizing the computational effort of feature extraction using joint-based attributes
in the skeleton system has received little attention in the research community.

c. At present, very few studies are dedicated to exploring the effective number of joints responsible for
recognizing human activity.

d. Few studies address optimizing the usage of the depth map, while many more apply machine learning to
offer higher precision.

Therefore, the problem statement of this work is: "developing a cost-effective system that minimizes
the computational effort of feature extraction to leverage the performance of a human action recognition
system is computationally challenging".

The proposed system aims to provide a computationally cost-effective mechanism to identify the
significant joints in a human activity recognition system. The significance of this mechanism is that the
proposed system targets a balance between higher accuracy in recognizing human activities and extremely
low computational effort. Minimizing the computational effort is only feasible if the solution aims for a
better form of feature extraction mechanism. Figure 1 gives a pictorial representation of the proposed
system.

Figure 1. Proposed system


With the analytical research approach adopted in the proposed system, the input image sequences are
considered with defined standard human actions. The depth map of the image is then extracted from the input
image dataset and is further used for extracting projected segments on the three different planes of the
x-y-z axes. An algorithm is developed to assess all the joints of the skeleton of the depth image by using
identity-based attributes. The process yields an effective number of joints, which contributes significantly
to human activity recognition with higher accuracy. The proposed system offers faster joint processing
irrespective of the selected motion pattern, and it does not depend on computing an unwanted number of
joints in order to recognize a human action effectively. The following section presents the algorithm
implementation of the proposed system.
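
As an illustration of the projection step described above, the following minimal Python sketch projects a
quantized depth map onto three orthogonal planes as binary occupancy masks. It is not the authors' MATLAB
implementation: the function name `project_depth_map`, the convention that a depth of 0 marks background, and
the assumption that depth values are already quantized to integer bins 1..z_bins are hypothetical choices
made only for this example.

```python
def project_depth_map(depth, z_bins):
    """Project a depth map onto the x-y, y-z and x-z planes.

    Assumptions (not from the paper): `depth` is a list of rows, each value is
    an integer bin in 1..z_bins, and 0 marks a background pixel.
    """
    height, width = len(depth), len(depth[0])
    front = [[0] * width for _ in range(height)]   # x-y plane (rows = y, cols = x)
    side = [[0] * z_bins for _ in range(height)]   # y-z plane (rows = y, cols = z)
    top = [[0] * width for _ in range(z_bins)]     # x-z plane (rows = z, cols = x)
    for y in range(height):
        for x in range(width):
            z = depth[y][x]
            if z == 0:
                continue  # a background pixel contributes to no projection
            front[y][x] = 1
            side[y][z - 1] = 1
            top[z - 1][x] = 1
    return front, side, top


# Toy 2x2 depth map with two foreground pixels.
depth_map = [
    [0, 2],
    [1, 0],
]
front, side, top = project_depth_map(depth_map, z_bins=3)
```

Each mask keeps only the occupied cells of one plane, so later processing can work on three small 2D
arrays instead of the full depth volume.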


2. ALGORITHM IMPLEMENTATION 

The algorithm is responsible for reading the input and applying simple processing to identify the
significant joints. It addresses problems observed in the existing literature, where there is a heavy
dependency on the feature extractor, mainly when the image area is quite large, which results in increased
computational complexity. Moreover, it was also found that the level of precision is very specific to the
kind of dataset and is often quite low. Hence, the developed algorithm applies extraction only to the
significant regions of the image area. The targeted advantage is that a smaller image area is involved in
feature extraction, because not all the joint information is useful for human activity identification.
The steps involved in the proposed algorithm are as follows:


Algorithm for Extracting Significant Joints
Input: a, j, J
Output: Djoin
Start
1. init a, j, J
2. θ → read(aia, sia, eia)
3. dma → f(θ)
4. C → g(θ)
5. Extract (x, y, z) from C
6. For s = 1:(X)
7.    S → [X, Y, Z]
8. End
9. For s = 2:(X)
10.   [Sp, S] → [X, Y, Z]
11.   Djoin → [Euc(S)]
12. End
End

The algorithm of the proposed system takes as input a (set of activities), j (joint tag), and
J (connection of skeleton joints) and, after processing, produces the output Djoin (display of joints)
(Line-1). The implementation begins by defining the action set as well as the names of the joints. Figure 2
shows the defined set of activities considered for human activity recognition, and Figure 3 shows the
defined names of joints considered for human activity recognition.


Figure 2. Defined set of activity considered for human activity recognition 




Figure 3. Defined name of joints considered for human activity recognition 


The next implementation step defines the identities associated with the name of the action (aia),
the identity of the subject (sia), and the identity of the example (eia). All this identity-related
information is read and stored in a matrix θ (Line-2). Thus, a data file is constructed on the basis of
these identities and is used to extract depth information from the given image file. Further, a function
f(x) is implemented which takes the locations of all the image files as input and returns the depth images
as output. The function takes the newly obtained matrix θ as input and finally results in an array of depth
matrices dma (Line-3). The input J, initialized in Line-1, specifies the connections among all the joints
of the skeleton, after which an input file is constructed on the basis of an action. A simple normalization
of all the coordinates is carried out, followed by reshaping the coordinates to obtain four row elements
along with a transposition operation. This is followed by reshaping to the total number of joints.
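
The normalization and reshaping described above can be sketched as follows. This is a minimal Python
illustration rather than the paper's MATLAB code: it assumes that each joint contributes four values per
frame (taken here to be x, y, z and one extra field such as a confidence score), and it uses per-axis
min-max scaling as a stand-in for the unspecified simple normalization. The helper names
`reshape_skeleton` and `normalize_axes` are hypothetical.

```python
def reshape_skeleton(values, num_joints, fields=4):
    """Group a flat list into frames of `num_joints` joints with `fields` values each."""
    per_frame = num_joints * fields
    if len(values) % per_frame != 0:
        raise ValueError("value count is not a whole number of frames")
    frames = []
    for start in range(0, len(values), per_frame):
        chunk = values[start:start + per_frame]
        frames.append([tuple(chunk[i:i + fields]) for i in range(0, per_frame, fields)])
    return frames


def normalize_axes(frames, axes=3):
    """Min-max scale the first `axes` fields of every joint to [0, 1], per axis.

    Min-max scaling is an assumption; the paper only mentions a simple normalization.
    """
    lows = [min(joint[a] for frame in frames for joint in frame) for a in range(axes)]
    highs = [max(joint[a] for frame in frames for joint in frame) for a in range(axes)]
    scaled_frames = []
    for frame in frames:
        scaled_frame = []
        for joint in frame:
            scaled = tuple(
                (joint[a] - lows[a]) / (highs[a] - lows[a]) if highs[a] > lows[a] else 0.0
                for a in range(axes)
            )
            # Fields beyond the coordinates (e.g. confidence) are kept unchanged.
            scaled_frame.append(scaled + tuple(joint[axes:]))
        scaled_frames.append(scaled_frame)
    return scaled_frames


# Two frames, two joints per frame, four values per joint.
flat = [0, 0, 0, 1, 2, 4, 8, 1, 1, 2, 4, 1, 2, 4, 8, 1]
frames = reshape_skeleton(flat, num_joints=2)
scaled = normalize_axes(frames)
```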

The complete process of extracting the coordinates from the input image is highlighted in Figure 4.
According to this process, the algorithm reads the identity-based information and constructs a new file,
which is checked for non-zero elements. If non-zero elements are present, the algorithm constructs a data
matrix B from the text file using a function g (Line-4), followed by a series of matrix-processing functions
func applied to B, which finally results in a temporary matrix C (Line-4). The matrix C is used to obtain
all the coordinate information (Line-5). Then, the depth is visualized, followed by evaluation of the
skeleton of each associated frame. The algorithm reads all the frames considering the X coordinates (Line-6)
in order to obtain the s-th frame in all three directions (Line-7). The next part of the algorithm iterates
over all the frames in order to compute the displacement of the joint coordinates (Line-9). A new matrix Sp
is constructed to hold the coordinates of the prior frame alongside the coordinates of the present frame,
i.e. S (Line-10). This operation is followed by computation of the Euclidean distance between the prior and
current frames, i.e. Sp and S, which gives the final displacement. This computation is carried out for all
the frames so that the algorithm can dynamically compute the joints and display them for every frame
(Line-11). Therefore, the algorithm exhibits a simple and cost-effective mechanism for identifying
significant joints, which are the primary backbone information required for every application of human
activity recognition. The analysis of the results obtained from this algorithm is given below.

Figure 4. Process of obtaining coordinates
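
The displacement computation of Line-9 to Line-11 can be sketched in a few lines of Python. This is a
hedged illustration, not the authors' MATLAB script: it assumes that every frame is a list of (x, y, z)
joint coordinates in a fixed joint order, and the function name `joint_displacements` is hypothetical.

```python
import math


def joint_displacements(frames):
    """Euclidean displacement of every joint between consecutive frames.

    For each frame s >= 2, the prior frame plays the role of Sp and the
    current frame the role of S. The result has one row per frame pair.
    """
    displacements = []
    for prior, current in zip(frames, frames[1:]):
        displacements.append([math.dist(p, c) for p, c in zip(prior, current)])
    return displacements


# Three frames with two joints each; the second joint barely moves.
frames = [
    [(0, 0, 0), (1, 1, 1)],
    [(3, 4, 0), (1, 1, 1)],
    [(3, 4, 12), (1, 2, 1)],
]
displacements = joint_displacements(frames)
```

Because only pairs of consecutive frames are compared, the cost grows linearly with the number of frames
and joints, which is in keeping with the low computational effort targeted by the algorithm.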


3. RESULT ANALYSIS 

The proposed system is implemented in MATLAB using the MSR Action 3D dataset [30] as input. The
algorithm is scripted and evaluated using 20 action sets, targeting precise identification of effective
joints. It is individually assessed using the performance parameters of displacement and standard deviation
with respect to each discrete human action defined in the dataset. Monitoring displacement makes it
possible to assess how distinctly the proposed system identifies the body in motion, and monitoring standard
deviation makes it feasible to determine the significant joints. Table 1 gives only three sample visual
outcomes obtained from the proposed system.
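
One way to turn the standard deviation into a joint ranking is sketched below. The paper does not state
the exact selection rule, so ranking joints by the spread of their displacement over time and keeping the
`top_k` largest is an assumption made for illustration, as are the names `significant_joints` and `top_k`.

```python
import statistics


def significant_joints(displacements, top_k):
    """Rank joints by the standard deviation of their displacement over time.

    `displacements` holds one row per frame pair and one column per joint.
    Returns the indices of the `top_k` most variable joints and all spreads.
    """
    num_joints = len(displacements[0])
    spread = [
        statistics.pstdev([row[j] for row in displacements])
        for j in range(num_joints)
    ]
    ranked = sorted(range(num_joints), key=lambda j: spread[j], reverse=True)
    return ranked[:top_k], spread


# Joint 0 moves a lot, joint 1 is static, joint 2 moves slightly.
displacements = [
    [5.0, 0.0, 1.0],
    [12.0, 0.0, 1.5],
    [1.0, 0.0, 1.0],
]
selected, spread = significant_joints(displacements, top_k=2)
```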
Table 1. Sample of Visual Outcomes Obtained

Apart from the individual outcomes, the proposed system is also benchmarked using standard training
algorithms to assess its accuracy. The benchmarking process is as follows: the proposed technique is
trained using two frequently used classifiers, i.e. the k-nearest neighbor algorithm (KNN) and the support
vector machine (SVM). Accuracy is analyzed by considering sensitivity and specificity as performance
parameters with respect to increasing training ratio.
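
For reference, sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The sketch below computes
both for one action class treated as the positive class against all others. Applying the measures
one-vs-rest to a multi-class action set is an assumption, since the paper does not describe how the
per-class values were aggregated; the function name and the labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity and specificity for `positive` against all other labels."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(y_true, y_pred):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


y_true = ["wave", "wave", "kick", "kick", "kick"]
y_pred = ["wave", "kick", "kick", "kick", "wave"]
sens, spec = sensitivity_specificity(y_true, y_pred, positive="wave")
```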
Figure 5 and Figure 6 indicate that the proposed system gives higher accuracy (with respect to
specificity and sensitivity) when it is trained using the KNN algorithm on the significant joints. From
this outcome, it can be said that it is not necessary to select all the joints for human activity
identification, as performing the identification with only the selected joints offers better accuracy. The
accuracy obtained using KNN is significantly better than that of the conventional SVM algorithm.

Figure 6. Analysis of Specificity
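
A k-nearest neighbor benchmark of the kind described above can be sketched in plain Python. The paper does
not report the value of k, the distance metric, or the feature layout, so k = 3, the Euclidean metric, the
two-dimensional feature vectors, and the helper names `split_by_ratio` and `knn_predict` are all
assumptions for illustration only.

```python
import math
from collections import Counter


def split_by_ratio(samples, labels, ratio):
    """Split data into training and test parts using a training ratio in (0, 1).

    The split keeps the given order; a real evaluation would shuffle first.
    """
    cut = int(len(samples) * ratio)
    return (samples[:cut], labels[:cut]), (samples[cut:], labels[cut:])


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest samples."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors built from significant-joint displacements.
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["wave", "wave", "kick", "kick"]
label_a = knn_predict(train_x, train_y, (0.2, 0.3), k=3)
label_b = knn_predict(train_x, train_y, (5.5, 5.0), k=3)
(train_part, _), (_, test_labels) = split_by_ratio([1, 2, 3, 4], list("abcd"), 0.5)
```

Repeating the prediction step for several training ratios and feeding the results into the sensitivity
and specificity measures would give curves of the kind reported in Figure 5 and Figure 6.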


4. CONCLUSION 
This paper gives a very simple and yet novel approach to enhancing the performance of human activity
recognition systems. The prime basis of the work is that existing mechanisms pursue a similar objective by
selecting all the features in the form of either a depth map image or a skeleton image, whereas the proposed
approach holds that considering so many feature points incurs computational complexity. This problem can be
solved if a mechanism is designed that identifies only the significant points in the skeleton image. The
proposed study yields significant outcomes that indicate higher accuracy when trained using KNN.